Previously, we attempted to explain the decisions our predictive models make using SHAP values. In this report, we will approach the same task with a different method - Local Interpretable Model-agnostic Explanations (LIME). To be specific, we will:
Local Interpretable Model-agnostic Explanations aims to provide explanations for individual observations by training local surrogate models - simpler, interpretable models fitted to approximate the behavior of the black-box model in the vicinity of a given observation. To give an analogy, this is similar to how one can approximate a manifold at a point with its tangent space. Specifically, LIME:
Mathematically, we may phrase it as $$\hat{g} = \operatorname*{arg\,min}_{g \in G} L(f, g, \pi(x)) + \Omega(g)$$ where $G$ denotes the class of surrogate models (for example, decision trees or linear regression models), $L$ is a loss function measuring the discrepancy between the black-box $f$ and the surrogate $g$, weighted by the proximity kernel $\pi(x)$ which defines the neighborhood of $x$, and $\Omega(g)$ denotes the penalty for the complexity of $g$ (for example, decision tree depth or the number of nonzero coefficients of a linear model).
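The optimization above can be sketched in a few lines of code. Below is a minimal from-scratch illustration (not the `lime` package itself): we sample points around $x$, weight them with a Gaussian proximity kernel playing the role of $\pi(x)$, and fit a weighted Lasso, whose L1 penalty acts as $\Omega(g)$. The model, kernel width, and sampling scale here are illustrative assumptions, not the report's actual setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import Lasso

# A stand-in black-box model f (in the report, this would be the stroke classifier)
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
f = GradientBoostingClassifier(random_state=0).fit(X, y)

def local_surrogate(x, f, n_samples=1000, kernel_width=1.0, alpha=0.01, seed=0):
    """Fit a sparse linear surrogate g approximating f near x."""
    rng = np.random.default_rng(seed)
    # Sample perturbed points Z in the vicinity of x
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.shape[0]))
    # Proximity kernel pi(x): closer samples get larger weights
    weights = np.exp(-np.sum((Z - x) ** 2, axis=1) / kernel_width**2)
    # Target: the black-box probability of the positive class
    target = f.predict_proba(Z)[:, 1]
    # Weighted Lasso: the weighted squared loss is L, the L1 term acts as Omega(g)
    g = Lasso(alpha=alpha).fit(Z, target, sample_weight=weights)
    return g.coef_  # per-feature local attributions

coefs = local_surrogate(X[0], f)
```

The L1 penalty keeps the surrogate sparse, so only a handful of features receive nonzero attributions - mirroring how LIME reports only the most important features.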
Before we continue, we must remark on one important aspect of LIME: the choice of an interpretable feature space for the data points. For example, when dealing with images, a feature importance for the value of a single pixel would be unhelpful - we are far more interested in higher-level features, such as an object of a given class being present in the image, or an object having some property (say, being red, or being oriented in a particular fashion). Since we are dealing with tabular data, however - and further taking into account what the columns are - there is no need to refine the features further: they are already sufficiently high-level and interpretable.
With that clarified, let us look at the output of the lime package:
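For reference, an explanation of this kind is typically produced along the following lines. The data, model, and `num_features` setting below are illustrative stand-ins, not the report's actual pipeline:

```python
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Stand-ins for the training data and the fitted classifier
train_X, train_y = make_classification(n_samples=300, n_features=4, random_state=0)
model = RandomForestClassifier(random_state=0).fit(train_X, train_y)

explainer = LimeTabularExplainer(
    train_X,
    mode="classification",
    feature_names=[f"f{i}" for i in range(4)],
    discretize_continuous=True,  # lime discretizes continuous variables by default
)
exp = explainer.explain_instance(train_X[0], model.predict_proba, num_features=4)
print(exp.as_list())  # list of (feature condition, local coefficient) pairs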
What we are looking at are:
the feature contributions (lime discretizes continuous variables by default) - or, to be more specific, the coefficients for the most important features in the local linear model trained on points in the vicinity of the selected observation, with the target being the predicted probability.

Let us now comment on the results themselves, and how to interpret them. The predicted probability of having a stroke is low (12%), and according to the local model, the most significant factor in it being low is the value of age being between 26 and 45 - specifically, it reduces the probability from the baseline by 22 percentage points; the next one, the subject never having been married, reduces it by 8 percentage points; them not having hypertension reduces it by 6 percentage points, and so on.
Let's look at a different example:
Here, the situation is diametrically opposite - the predicted probability is 60%, with age being over 61 contributing 40 percentage points to the local surrogate's prediction, being ever married adding 7 percentage points, and so on.
One interesting thing about LIME is that, because we sample points in the vicinity of the observation, its explanations inherently depend on the random seed. A question thus arises: to what extent do the explanations differ across seeds? Let's investigate:
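The stability check can be sketched as follows: refit the local surrogate under several seeds and compare the resulting feature rankings (here with a hand-rolled surrogate rather than the `lime` package; the model, kernel, and sampling scale are illustrative assumptions):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import Ridge

X, y = make_classification(n_samples=400, n_features=5, random_state=0)
f = GradientBoostingClassifier(random_state=0).fit(X, y)
x = X[0]  # the observation being explained

def explain(seed, n_samples=2000, kernel_width=1.0):
    """Fit a local linear surrogate around x using the given seed."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.shape[0]))
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / kernel_width**2)
    g = Ridge(alpha=1.0).fit(Z, f.predict_proba(Z)[:, 1], sample_weight=w)
    return g.coef_

# Rank features by |coefficient| for a handful of seeds
ranks = [np.argsort(-np.abs(explain(s))) for s in range(5)]
for r in ranks:
    print(r)  # stable explanations keep the same features in the leading positions
```

With a large enough sample the top-ranked features tend to stay put, while features with near-tied coefficients may swap places between seeds - exactly the kind of mild instability discussed below.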
We can see that the explanations differ ever so slightly - to point out some of the differences, the ranks of hypertension=No and ever_married=No are sometimes swapped, and likewise for the glucose level and BMI features. Whether that's an issue depends on the application, I would say.
Having now two methods in our arsenal, let's see whether (and if so, how) the explanations they offer differ.
We will perform the investigation on the following data point:
For them, lime yields the following explanation:
whereas shap outputs the following:
We can see that the results are fairly similar, but not identical - for example, SHAP attributes less importance to the lack of hypertension than LIME does. Let's look at another example, just to check whether this pattern holds:
It would seem (from the admittedly atrociously small sample of two) that there exist some definite differences between the explanations offered by SHAP and LIME.
As we did for SHAP, we will now look at what the explanations look like for the logistic regression model, and how they differ from those for the XGBoost model. This case is particularly interesting, since a local linear approximation to a linear model is, well, the same model - with the caveat that logistic regression is linear in the log-odds, not in the probabilities.
In my estimation, the features indicated as most significant are, for the most part, the same as for the XGBoost model. One difference I've noticed is that logistic regression seems to put far more stock in the different work types - for the tree model, their contribution was virtually always negligible.
Now, let's test the aforementioned "local approximation of a linear model is that same linear model" hypothesis. One way to check it is to see whether, when explaining the logits of the probabilities returned by the logistic regression model, we get the same attributions (read: coefficients) for different observations. Doing just that for the four previous data points, we get:
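One way to run this check from scratch (outside `lime`, with no discretization and no complexity penalty; the data and sampling settings are illustrative): since logistic regression satisfies $\text{logit}(p(x)) = w \cdot x + b$ exactly, an ordinary least-squares surrogate fitted on the logits of sampled neighbors should recover the same coefficients $w$ at any observation:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LinearRegression, LogisticRegression

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
logreg = LogisticRegression().fit(X, y)

def local_logit_surrogate(x, seed=0, n_samples=500):
    """Fit an unpenalized linear surrogate to the model's logits near x."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.shape[0]))
    logits = logreg.decision_function(Z)  # log-odds: exactly linear in Z
    return LinearRegression().fit(Z, logits).coef_

# Surrogates at two different observations recover the very same coefficients
c0 = local_logit_surrogate(X[0])
c1 = local_logit_surrogate(X[1])
print(np.allclose(c0, logreg.coef_.ravel()), np.allclose(c0, c1))  # True True
```

In this idealized setting the recovery is exact. With lime's default discretization of continuous features and the complexity penalty $\Omega$, it no longer is - which is consistent with the differing contributions we observe.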
The contributions for the same (possibly discretized) features are not the same. One can then wonder what the reason behind this could be - I surmise that the different sampled points used to fit each local model, as well as the complexity penalty, could play a role in this discrepancy, though I cannot say for sure.